feat(py): generate middleware #5253

Open
huangjeff5 wants to merge 78 commits into main from jh-mw

@huangjeff5 commented May 7, 2026

Python middleware for generate(): use=[...]

Adds a middleware system for Python that lets you intercept and wrap generate() calls at three granularities: the full generate iteration, each model API call, and each tool execution. Replaces the old ModelMiddleware abstraction.

What it does

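Middleware is applied per generate() call via a use=[...] parameter. Each entry in the list wraps the call in a chain: the first entry is outermost, so its code before next_fn runs first and its code after next_fn runs last. Three hooks are available, plus a tools() method for contributing extra tools: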

  • wrap_generate — wraps each iteration of the tool loop (model call + tool resolution). Runs once per agentic turn.
  • wrap_model — wraps each raw model API call. Use for retry, fallback, logging latency.
  • wrap_tool — wraps each individual tool execution. Use for approval gates, sandboxing, error enrichment.
  • tools() — contribute extra tools dynamically per generate() call (e.g. skills libraries, sandboxed filesystem ops). Tools are scoped to the call and don't pollute the root registry.
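
The ordering rule described above (first entry outermost) can be illustrated with a minimal, framework-free sketch. Nothing here is Genkit API: `compose`, `tracer`, and `fake_model` are hypothetical names, and requests and responses are plain strings purely for illustration.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stand-ins: a handler maps a request string to a response string.
Handler = Callable[[str], Awaitable[str]]
Middleware = Callable[[str, Handler], Awaitable[str]]


def _bind(mw: Middleware, next_fn: Handler) -> Handler:
    """Fix the `next_fn` a middleware will call, yielding a plain handler."""
    async def bound(req: str) -> str:
        return await mw(req, next_fn)
    return bound


def compose(middlewares: list[Middleware], terminal: Handler) -> Handler:
    """Wrap `terminal` so that middlewares[0] ends up outermost."""
    handler = terminal
    # Walk the list backwards: the first entry is applied last and wraps the rest.
    for mw in reversed(middlewares):
        handler = _bind(mw, handler)
    return handler


def tracer(name: str, log: list[str]) -> Middleware:
    """Build a middleware that records when it runs relative to `next_fn`."""
    async def mw(req: str, next_fn: Handler) -> str:
        log.append(f'{name}:before')
        resp = await next_fn(req)
        log.append(f'{name}:after')
        return resp
    return mw


async def fake_model(req: str) -> str:
    """Terminal handler standing in for the real model call."""
    return f'echo({req})'


def run_demo() -> tuple[str, list[str]]:
    log: list[str] = []
    chain = compose([tracer('outer', log), tracer('inner', log)], fake_model)
    return asyncio.run(chain('Hello')), log
```

Calling `run_demo()` records the outer middleware's "before" step first and its "after" step last, which is the same onion-style ordering the use=[...] list is described as having.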

Defining and registering middleware inline (app developers)

Subclass BaseMiddleware (a Pydantic model) and decorate it using the @ai.middleware decorator on your Genkit instance. This automatically registers the middleware with the registry so that it is discoverable by the Dev UI and usable across your app:

import time

from genkit import Genkit
from genkit.middleware import BaseMiddleware, ModelHookParams
from genkit.plugins.middleware import Retry

ai = Genkit()

@ai.middleware(name='latency_logger', description='Logs model latency')
class LatencyLogger(BaseMiddleware):
    prefix: str = '[trace]'

    async def wrap_model(self, params: ModelHookParams, next_fn):
        t = time.monotonic()
        resp = await next_fn(params)
        print(f'{self.prefix} model call took {time.monotonic() - t:.3f}s')
        return resp

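async def main() -> None:
    # `await` is only valid inside a coroutine, so the call lives in an async function.
    # Run it with asyncio.run(main()) or from an existing event loop.
    response = await ai.generate(
        model='googleai/gemini-flash-latest',
        prompt='Hello',
        use=[
            Retry(max_retries=5),  # first entry: outermost wrapper
            LatencyLogger(prefix='[myapp]'),
        ],
    )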

Dev UI integration

Any middleware registered via plugins or decorated with @ai.middleware is automatically available in the Dev UI — no extra registration steps needed.

Once registered, the middleware shows up on the Model Runner page in the Dev UI, where you can mix-and-match middleware and set config values interactively. When you run a generate call from there, the Dev UI passes a MiddlewareRef — a name plus a config dict — into generate_action. The framework resolves that ref against the registry, instantiates the middleware class with the provided config (cls(**config)), and runs the chain exactly as it would inline.
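
The resolution step described above (look up a name, then call cls(**config)) might look roughly like the following. This is a simplified sketch, not the framework's code: the `MiddlewareRef` dataclass, the `REGISTRY` dict, the `resolve` helper, and the stand-in `Retry` class are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class MiddlewareRef:
    """Hypothetical shape of a ref: a registered name plus a config dict."""
    name: str
    config: dict[str, Any] = field(default_factory=dict)


@dataclass
class Retry:
    """Stand-in for a registered middleware class with one config field."""
    max_retries: int = 3


# Hypothetical registry mapping a registered name to its middleware class.
REGISTRY: dict[str, type] = {'retry': Retry}


def resolve(ref: MiddlewareRef) -> Any:
    """Look up the class by name and instantiate it with the ref's config."""
    try:
        cls = REGISTRY[ref.name]
    except KeyError:
        raise LookupError(f'middleware {ref.name!r} is not registered') from None
    # Same construction path as inline use: keyword arguments from the config dict.
    return cls(**ref.config)
```

Because the instance is built with keyword arguments either way, a ref such as `MiddlewareRef('retry', {'max_retries': 5})` yields the same object as writing `Retry(max_retries=5)` inline.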

Pre-packaging middleware through a plugin (plugin authors)

Use new_middleware to build a MiddlewareDesc from each BaseMiddleware subclass, then wrap the resulting descriptors with middleware_plugin to produce a standard Plugin for plugins=[...]. Note that plugin middleware classes do not require decorators:

mylib/middleware.py (plugin author):

from genkit.middleware import BaseMiddleware, ModelHookParams
from genkit.plugin_api import new_middleware, middleware_plugin

class Retry(BaseMiddleware):
    max_retries: int = 3

    async def wrap_model(self, params: ModelHookParams, next_fn):
        for attempt in range(self.max_retries + 1):
            try:
                return await next_fn(params)
            except Exception:
                if attempt == self.max_retries:
                    raise

class Fallback(BaseMiddleware):
    models: list[str] = []

    async def wrap_model(self, params: ModelHookParams, next_fn):
        ...  # try primary, then each fallback model

def my_middleware_plugin():
    return middleware_plugin([
        new_middleware(Retry, name='retry', description='Retries model calls on transient failures'),
        new_middleware(Fallback, name='fallback'),
    ])
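
The Fallback body above is elided in the PR description. As a generic illustration of the "try primary, then each fallback model" idea only, here is a framework-free sketch. It does not show how the real middleware selects a model through ModelHookParams; `call_with_fallback`, `ModelCall`, and `flaky` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical call shape: (model_name, prompt) -> response text.
ModelCall = Callable[[str, str], Awaitable[str]]


async def call_with_fallback(
    call: ModelCall, primary: str, fallbacks: list[str], prompt: str
) -> str:
    """Try the primary model, then each fallback in order; re-raise the last error."""
    last_error: Optional[Exception] = None
    for model in [primary, *fallbacks]:
        try:
            return await call(model, prompt)
        except Exception as err:  # a real implementation would likely filter error types
            last_error = err
    assert last_error is not None  # the loop always runs at least once
    raise last_error


async def flaky(model: str, prompt: str) -> str:
    """Demo model call in which only the primary model fails."""
    if model == 'primary':
        raise RuntimeError('primary unavailable')
    return f'{model}: {prompt}'
```

With the demo `flaky` call, `asyncio.run(call_with_fallback(flaky, 'primary', ['backup'], 'hi'))` falls through to the backup model. If every model fails, the last exception propagates to the caller.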

@github-actions bot added the docs, python, and config labels May 7, 2026

@gemini-code-assist bot left a comment

Code Review

This pull request implements a robust middleware system for Genkit, enabling interception and modification of model generation, API calls, and tool executions. It introduces the genkit-plugin-middleware package, providing standard middleware for retries, fallbacks, tool approval, skills, and filesystem operations. Core generation logic was refactored to handle middleware normalization and per-call scoping. Feedback identifies redundant model copies in the asynchronous generation methods and a design conflict between validation tests and the normalization implementation. Furthermore, the reviewer noted blocking synchronous I/O in asynchronous tool implementations and issues with the jitter calculation in the retry middleware that could cause delays to exceed configured maximums.
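
To make the jitter concern concrete, one common way to keep a backoff delay within its configured maximum is to clamp after jitter is applied rather than before. The sketch below is an assumption about the general technique, not the code under review; `backoff_delay` and all of its parameter names are hypothetical.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    initial: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 60.0,
    jitter: float = 0.5,
    rng: Callable[[], float] = random.random,
) -> float:
    """Exponential backoff with proportional jitter, clamped to `max_delay`."""
    base = initial * factor ** attempt
    # Jitter stretches the delay by up to `jitter` * 100 percent.
    jittered = base * (1 + jitter * rng())
    # Clamp last: clamping before adding jitter would let the result exceed max_delay.
    return min(jittered, max_delay)
```

Passing `rng` explicitly keeps the function deterministic in tests; in normal use the default `random.random` supplies the randomness.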

Comment thread py/packages/genkit/src/genkit/_ai/_aio.py Outdated
Comment thread py/packages/genkit/src/genkit/_ai/_aio.py Outdated
Comment thread py/packages/genkit/tests/genkit/ai/generate_test.py Outdated
Comment thread py/plugins/middleware/src/genkit/plugins/middleware/_filesystem.py Outdated
Comment thread py/plugins/middleware/src/genkit/plugins/middleware/_retry.py Outdated
Comment thread py/plugins/middleware/src/genkit/plugins/middleware/_skills.py Outdated
Comment thread py/packages/genkit/src/genkit/_core/_plugin.py Outdated
Comment thread py/packages/genkit/src/genkit/_ai/_generate.py Outdated
Comment thread py/packages/genkit/src/genkit/_ai/_generate.py
Comment thread py/packages/genkit/tests/genkit/veneer/veneer_test.py Outdated
Comment thread py/packages/genkit/src/genkit/_core/_middleware.py Outdated
Comment thread py/packages/genkit/tests/genkit/core/reflection_v2_test.py Outdated
@huangjeff5 huangjeff5 requested review from apascal07 and pavelgj May 19, 2026 17:41